biplotEZ

User-friendly biplots in R



Centre for Multi-Dimensional Data Visualisation (MuViSU)
muvisu@sun.ac.za



StatCon2023

Presenters

Dr Raeesa Ganey (Wits)

 

Dr Johané Nienkemper-Swanepoel (SU)

Meet the Team (SU)

Prof. Niël le Roux

Prof. Sugnet Lubbe

Dr Carel van der Merwe

Meet the Team (SU students)

Zoë-Mae Adams (PhD)

 

Ruan Buys (MComm)

Outline

  • Session 1: Introduction to biplots
    • Introduction to biplots
    • PCA biplot
    • Getting started: biplotEZ
  • Session 2: Other types of biplots
    • Canonical Variate Analysis biplots
    • CVA biplot with biplotEZ

Introduction

  • The biplot is a powerful and very useful data visualisation tool. Biplots make information in a table of data become transparent, revealing the main structures in the data in a methodical way, for example patterns of correlations between variables or similarities between the observations.
  • Consider an \(n\) x \(p\) data matrix containing numerical information on \(p\) variables for each of \(n\) samples.
  • The \(n\) samples can be represented in a two-dimensional scatter diagram, where the coordinate axes can represent any two variables.
  • Samples can also be represented in a three-dimensional diagram developed from Cartesian geometry and conceptually in multidimensional forms in which there are many coordinate axes to represent the \(n\) samples.

Scatterplot of 2 variables

Biplots

  • Although two-dimensional scatter diagrams are crucial for presenting data, multidimensional scatter diagrams tend not to be.
  • Statisticians therefore, developed methods for approximating multidimensional scatter diagrams into a biplot.
  • A biplot is a generalisation of a two-dimensional scatter diagram of data that exists in a higher dimensional space, where information on both samples and variables can be displayed graphically.
  • Gabriel (1971) introduced the biplot in which the samples are represented as \(n\) points and the variables are represented as \(p\) vectors relative to the same axes and origin.

Iris data

library(tibble)
tibble(iris)
# # A tibble: 150 × 5
#    Sepal.Length Sepal.Width Petal.Length
#           <dbl>       <dbl>        <dbl>
#  1          5.1         3.5          1.4
#  2          4.9         3            1.4
#  3          4.7         3.2          1.3
#  4          4.6         3.1          1.5
#  5          5           3.6          1.4
#  6          5.4         3.9          1.7
#  7          4.6         3.4          1.4
#  8          5           3.4          1.5
#  9          4.4         2.9          1.4
# 10          4.9         3.1          1.5
# # ℹ 140 more rows
# # ℹ 2 more variables: Petal.Width <dbl>,
# #   Species <fct>
summary(iris)
#   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
#  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
#  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
#  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
#  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
#  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
#  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
#        Species  
#  setosa    :50  
#  versicolor:50  
#  virginica :50  
#                 
#                 
# 

Biplot of Iris data

library(biplotEZ)
biplot(iris[,1:4],iris[,5]) |> PCA() |> plot()

Biplots

Considerations

  • the types of variables (quantitative, qualitative, ordinal, etc.);
  • the method used for displaying samples;
  • what the biplot will display, either for predicting a sample point or interpolating a new sample point.

For now the focus will be on (1) continuous data, (2) the method used to display samples through approximations of the data, and (3) predicting samples.

Constructing a biplot

Data: \({\bf{X}}\)

#          X1       X2        X3
# 1  5.418840 5.054240  8.711160
# 2  3.129920 1.783160  3.385920
# 3  6.128080 2.173200  8.173560
# 4  6.781120 4.753280  8.731640
# 5  7.346560 5.893200 11.303040
# 6  7.208200 3.744000 10.075760
# 7  7.039440 5.213640  8.608840
# 8  5.465720 4.492640  5.596520
# 9  7.723240 4.708120 11.357480
# 10 7.109560 4.987520  7.732840
# 11 8.135800 4.392000  9.264840
# 12 6.287480 5.908720  7.488240
# 13 4.648880 7.198280  8.573720
# 14 5.798600 6.120080  8.254840
# 15 8.084560 3.234840  8.966560
# 16 6.157773 5.743455  9.899045
# 17 3.556727 2.026318  3.847636
# 18 6.963727 2.469545  9.288136
# 19 7.705818 5.401455  9.922318
# 20 8.348364 6.696818 12.844364
# 21 8.191136 4.254545 11.449727
# 22 7.999364 5.924591  9.782773
# 23 6.211045 5.105273  6.359682
# 24 8.776409 5.350136 12.906227
# 25 8.079045 5.667636  8.787318

Constructing a biplot

Geometrically the rows of \({\bf{X}}\) are given as coordinates of \(n\) samples in the \(p\)-dimensional space \(\mathbb{R}^p\).

The aim is to seek an \(r\)-dimensional plane that contains the points whose coordinates are given by the rows of \({\bf{\hat{X}}}_{[r]}\) which minimises a least squares criterion given by, \[\begin{equation} || {\bf{X}} - {\bf{\hat{X}}}_{[r]}||^2 = tr\{({\bf{X}} - {\bf{\hat{X}}}_{[r]})({\bf{X}} - {\bf{\hat{X}}}_{[r]})'\}. \end{equation}\]

The best approximation that minimises the least squares criterion is the \(r\)-dimensional Eckart-Young approximation given by \({\bf{\hat{X}}}_{[r]} = {\bf{U}} {\bf{D}}_{[r]} {\bf{V}}'\)

Representing samples

A standard result when \(r=2\) from is that the row vectors of \({\bf{\hat{X}}}_{[2]}\) are the orthogonal projections of the corresponding row vectors of \({\bf{X}}\) onto the column space of \({\bf{V}}_2\). The projections are therefore,

\[\begin{equation} {\bf{X}} {\bf{V}}_2. \end{equation}\] These projections are also known as the first two principal components.

Representing variables

The columns of \({\bf{X}}\) are approximated by the first two rows of \({\bf{V}}\), which now represent the axes for each variable.

Calibrated biplot axes

We have constructed a biplot, but the variables represented by the vectors (arrows) have no calibration.

That meaning, there are no markers on the vectors representing the variables analogous to ordinary scatterplots.

To construct a biplot axis with relevant markers for a variable, a \((p-1)\)-dimensional hyperplane \(\mathscr{N}\) perpendicular to the Cartesian axis is required.

First plane

From the data, \(p = 3\) therefore, a two-dimensional hyperplane is constructed perpendicular to \(X_1\) through a specific value of \(X_1\), say \(\mu\).

The intersection of \(\mathscr{L}\) and \(\mathscr{N}\) is an \((r-1)\)-dimensional intersection space, which in this case will be indicated by a line. All the points on this intersection line in \(\mathscr{L}\) will predict the value for \(\mu\) for the \(X_1\)-axis.

Second plane

The plane \(\mathscr{N}\) is shifted orthogonally through another value on \(X_1\) and all the points on the intersection line of \(\mathscr{L}\) and \(\mathscr{N}\) will predict that value that the plane goes through.

Intersection lines

As the plane \(\mathscr{N}\) is shifted along the \(X_1\)-axis, a series of parallel intersection spaces is obtained.

Any line passing through the origin will pass through these intersection spaces and can be used as an axis fitted with markers according to the value associated with the particular intersection space.

To facilitate orthogonal projection onto the axis, similar to an ordinary scatterplot, the line orthogonal to these intersection spaces is chosen.

PCA Biplot

biplot(X) |> PCA() |> plot()

Installing biplotEZ

# Install from CRAN
install.packages("biplotEZ")
library(biplotEZ)

# Install the development version from GitHub
library(devtools)
install_github("MuViSU/biplotEZ_beta")
library(biplotEZ)

# Getting help                 
help(package="biplotEZ")

# helpfiles for specific functions
?biplotEZ::biplot
?biplotEZ::PCA
# ...

General notes

  • Use the native R pipe: |>
  • Since %>% is not required, you do not have to load:
library(magrittr)
  • To set |> as your default, change the keyboard shortcut Ctrl-Shift-M:

Functions in current version

  • Use the following function to obtain a list of available functions in a package: \(~\)
lsf.str("package:biplotEZ")
# alpha.bags : function (bp, alpha = 0.95, which = NULL, col = ez.col, lty = 1, lwd = 1, 
#     max = 2500, trace = TRUE)  
# axes : function (bp, X.names = colnames(bp$X), which = 1:bp$p, col = grey(0.7), 
#     lwd = 1, lty = 1, label.dir = "Orthog", label.col = col, label.cex = 0.75, 
#     label.dist = 0, ticks = 5, tick.col = col, tick.size = 1, tick.label = TRUE, 
#     tick.label.col = tick.col, tick.label.cex = 0.6, tick.label.side = "left", 
#     tick.label.offset = 0.5, tick.label.pos = 1, predict.col = col, predict.lwd = lwd, 
#     predict.lty = lty, ax.names = X.names, orthogx = 0, orthogy = 0)  
# biplot : function (data, group.aes = NULL, center = TRUE, scaled = FALSE, Title = NULL)  
# concentration.ellipse : function (bp, df = 2, kappa = NULL, which = NULL, alpha = 0.95, col = ez.col, 
#     lty = 1, lwd = 1, alpha.transparency = 0.5, trace = TRUE)  
# CVA : function (bp, dim.biplot = c(2, 1, 3), e.vects = 1:ncol(bp$X), group.aes = bp$group.aes, 
#     weightedCVA = "weighted", ...)  
# legend.type : function (bp, samples = FALSE, means = FALSE, bags = FALSE, ellipses = FALSE, 
#     new = FALSE, ...)  
# PCA : function (bp, dim.biplot = c(2, 1, 3), e.vects = 1:ncol(bp$X), group.aes = NULL, 
#     correlation.biplot = FALSE, ...)  
# samples : function (bp, col = ez.col, pch = 3, cex = 1, label = FALSE, label.cex = 0.75, 
#     label.side = "bottom", connected = FALSE, alpha = 1)

Flow of functions

  • Start with biplot().
  • Use the preferred biplot: PCA() or CVA(), available in current version.
    • Format samples() and axes().
    • Add alpha.bags() or concentration.ellipse().
    • Add a legend.type().
  • To visualise, end with plot().

Use pipe ( |> ) between each subsequent function. The order of samples(), axes(), alpha.bags(), concentration.ellipse() and legend.type() can be interchanged.

biplot() - arguments

# function (data, group.aes = NULL, center = TRUE, scaled = FALSE, 
#     Title = NULL) 
# NULL
Argument Description
data A dataframe or matrix containing all variables the user wants to analyse.
group.aes Variable from the data to be used as a grouping variable.
center TRUE or FALSE
scaled TRUE or FALSE
Title Title of the biplot to be rendered.

biplot() - print

out <- biplot(data = iris)
out
# Object of class biplot, based on 150 samples and 5 variables.

biplot() - values

out <- biplot(iris[,1:4], group.aes = iris[,5])
str(out)
# List of 14
#  $ X        : num [1:150, 1:4] -0.743 -0.943 -1.143 -1.243 -0.843 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#   ..- attr(*, "scaled:center")= Named num [1:4] 5.84 3.06 3.76 1.2
#   .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ Xcat     : NULL
#  $ raw.X    :'data.frame':    150 obs. of  4 variables:
#   ..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#   ..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#   ..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#   ..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#  $ na.action: NULL
#  $ center   : logi TRUE
#  $ scaled   : logi FALSE
#  $ means    : Named num [1:4] 5.84 3.06 3.76 1.2
#   ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ sd       : num [1:4] 1 1 1 1
#  $ n        : int 150
#  $ p        : int 4
#  $ group.aes: Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
#  $ g.names  : chr [1:3] "setosa" "versicolor" "virginica"
#  $ g        : int 3
#  $ Title    : NULL
#  - attr(*, "class")= chr "biplot"

PCA() - arguments

# function (bp, dim.biplot = c(2, 1, 3), e.vects = 1:ncol(bp$X), 
#     group.aes = NULL, correlation.biplot = FALSE, ...) 
# NULL
Argument Description
bp Object of class biplot.
dim.biplot Dimension of the biplot. Only values 1, 2 and 3 are accepted, with default 2. Only 2D in current version.
e.vects Which eigenvectors (principal components) to extract, with default 1:dim.biplot.

PCA() - arguments

# function (bp, dim.biplot = c(2, 1, 3), e.vects = 1:ncol(bp$X), 
#     group.aes = NULL, correlation.biplot = FALSE, ...) 
# NULL
Argument Description
group.aes Optional vector of the same length as the number of rows in the data matrix for differentiated aesthetics for samples.
correlation.biplot If FALSE, the distances between sample points are optimally approximated in the biplot. If TRUE, the correlations between variables are optimally approximated by the cosine of the angles between axes. Deafult is FALSE.

PCA() - arguments

out <- biplot(iris[,1:4]) |> PCA()
out
# Object of class biplot, based on 150 samples and 4 variables.

PCA() - values

str(out)
# List of 18
#  $ X          : num [1:150, 1:4] -0.743 -0.943 -1.143 -1.243 -0.843 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#   ..- attr(*, "scaled:center")= Named num [1:4] 5.84 3.06 3.76 1.2
#   .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ Xcat       : NULL
#  $ raw.X      :'data.frame':  150 obs. of  4 variables:
#   ..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#   ..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#   ..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#   ..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#  $ na.action  : NULL
#  $ center     : logi TRUE
#  $ scaled     : logi FALSE
#  $ means      : Named num [1:4] 5.84 3.06 3.76 1.2
#   ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ sd         : num [1:4] 1 1 1 1
#  $ n          : int 150
#  $ p          : int 4
#  $ group.aes  : Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
#  $ g.names    : chr "1"
#  $ g          : int 1
#  $ Title      : NULL
#  $ Z          : num [1:150, 1:2] -2.68 -2.71 -2.89 -2.75 -2.73 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : NULL
#  $ Vr         : num [1:4, 1:2] 0.3614 -0.0845 0.8567 0.3583 -0.6566 ...
#  $ Xhat       : num [1:150, 1:4] 5.08 4.75 4.7 4.64 5.07 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : NULL
#   ..- attr(*, "scaled:center")= Named num [1:4] -5.84 -3.06 -3.76 -1.2
#   .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ ax.one.unit: num [1:4, 1:2] 0.643 -0.156 1.121 2.672 -1.169 ...
#  - attr(*, "class")= chr [1:2] "biplot" "PCA"

PCA() with plot() - simple

biplot(iris[,1:4]) |> PCA() |> plot()

PCA() with plot() - groups

biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> plot()

samples() - arguments

# function (bp, col = ez.col, pch = 3, cex = 1, label = FALSE, 
#     label.cex = 0.75, label.side = "bottom", connected = FALSE, 
#     alpha = 1) 
# NULL
Argument Description
bp Object of class biplot.
col Sample colour.
pch Sample plotting character.
cex Sample character expansion.
label Logical argument, whether samples should be labelled or not, with default FALSE.

samples() - arguments

# function (bp, col = ez.col, pch = 3, cex = 1, label = FALSE, 
#     label.cex = 0.75, label.side = "bottom", connected = FALSE, 
#     alpha = 1) 
# NULL
Argument Description
label.cex Label text expansion.
label.side Side of the plotting character where label appears, with default “bottom”.
connected Logical argument, whether samples are connected in order of rows of data matrix, with default “FALSE”. Not in current version.
alpha Opacity, with default 1. Not in current version.

samples() - values

out <- biplot(iris[,1:4]) |> PCA() |> samples()
str(out)
# List of 19
#  $ X          : num [1:150, 1:4] -0.743 -0.943 -1.143 -1.243 -0.843 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#   ..- attr(*, "scaled:center")= Named num [1:4] 5.84 3.06 3.76 1.2
#   .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ Xcat       : NULL
#  $ raw.X      :'data.frame':  150 obs. of  4 variables:
#   ..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#   ..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#   ..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#   ..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#  $ na.action  : NULL
#  $ center     : logi TRUE
#  $ scaled     : logi FALSE
#  $ means      : Named num [1:4] 5.84 3.06 3.76 1.2
#   ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ sd         : num [1:4] 1 1 1 1
#  $ n          : int 150
#  $ p          : int 4
#  $ group.aes  : Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
#  $ g.names    : chr "1"
#  $ g          : int 1
#  $ Title      : NULL
#  $ Z          : num [1:150, 1:2] -2.68 -2.71 -2.89 -2.75 -2.73 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : NULL
#  $ Vr         : num [1:4, 1:2] 0.3614 -0.0845 0.8567 0.3583 -0.6566 ...
#  $ Xhat       : num [1:150, 1:4] 5.08 4.75 4.7 4.64 5.07 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : NULL
#   ..- attr(*, "scaled:center")= Named num [1:4] -5.84 -3.06 -3.76 -1.2
#   .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ ax.one.unit: num [1:4, 1:2] 0.643 -0.156 1.121 2.672 -1.169 ...
#  $ samples    :List of 9
#   ..$ col       : chr "blue"
#   ..$ pch       : num 3
#   ..$ cex       : num 1
#   ..$ label     : logi FALSE
#   ..$ label.cex : num 0.75
#   ..$ label.side: chr "bottom"
#   ..$ connected : logi FALSE
#   ..$ alpha     : num 1
#   ..$ g         : int 1
#  - attr(*, "class")= chr [1:2] "biplot" "PCA"

PCA(), samples() with plot()

#changing the colour, plotting characters and size of the samples
biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> samples(col = "purple", 
 pch = c(15,1,17), cex = 1.2) |> plot()

PCA(), samples() with plot()

#changing the grouping colours
#printing the sample names and changing the size
#omitting the plotting character
biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> samples(label = TRUE, pch = "", 
  col = c("purple","magenta","blue"), label.cex=1) |> plot()

PCA(), samples() with plot()

#changing the side of the label
biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> samples(label = TRUE, pch = "", 
  col = c("purple","magenta","blue"), label.side = "top") |> plot()

PCA(), samples() with plot()

#changing the row names
X <- iris[,1:4]
rownames(X) <- paste("s",rownames(iris),sep=":")
biplot(X, group.aes = iris[,5]) |> PCA() |> samples(label = TRUE, pch = "", 
  col = c("purple","magenta","blue")) |> plot()

axes() - arguments

# function (bp, X.names = colnames(bp$X), which = 1:bp$p, col = grey(0.7), 
#     lwd = 1, lty = 1, label.dir = "Orthog", label.col = col, 
#     label.cex = 0.75, label.dist = 0, ticks = 5, tick.col = col, 
#     tick.size = 1, tick.label = TRUE, tick.label.col = tick.col, 
#     tick.label.cex = 0.6, tick.label.side = "left", tick.label.offset = 0.5, 
#     tick.label.pos = 1, predict.col = col, predict.lwd = lwd, 
#     predict.lty = lty, ax.names = X.names, orthogx = 0, orthogy = 0) 
# NULL
Argument Description
bp Object of class biplot.
X.names Refers to the column names of bp to specify which axes to label.
which Column numbers for which axes should be constructed.
col Axis colours.
lwd Axis line widths.

axes() - arguments

# function (bp, X.names = colnames(bp$X), which = 1:bp$p, col = grey(0.7), 
#     lwd = 1, lty = 1, label.dir = "Orthog", label.col = col, 
#     label.cex = 0.75, label.dist = 0, ticks = 5, tick.col = col, 
#     tick.size = 1, tick.label = TRUE, tick.label.col = tick.col, 
#     tick.label.cex = 0.6, tick.label.side = "left", tick.label.offset = 0.5, 
#     tick.label.pos = 1, predict.col = col, predict.lwd = lwd, 
#     predict.lty = lty, ax.names = X.names, orthogx = 0, orthogy = 0) 
# NULL
Argument Description
lty Axis line types.
label.dir Direction of the axis label. “Orthog”, “Hor” or “Paral”, with default “Orthog”.
label.col Colour of the axis label.
label.cex Character expansion of the axis label.
label.dist Distance of the axis label from the plot.

axes() - arguments

# function (bp, X.names = colnames(bp$X), which = 1:bp$p, col = grey(0.7), 
#     lwd = 1, lty = 1, label.dir = "Orthog", label.col = col, 
#     label.cex = 0.75, label.dist = 0, ticks = 5, tick.col = col, 
#     tick.size = 1, tick.label = TRUE, tick.label.col = tick.col, 
#     tick.label.cex = 0.6, tick.label.side = "left", tick.label.offset = 0.5, 
#     tick.label.pos = 1, predict.col = col, predict.lwd = lwd, 
#     predict.lty = lty, ax.names = X.names, orthogx = 0, orthogy = 0) 
# NULL
Argument Description
ticks Indicator of the number of tick marks on each axis, with default 5.
tick.col Colour of tick marks.
tick.size Size of tick marks.
tick.label Logical whether tick marks should be labelled, with default TRUE.
tick.label.col Colour of tick mark labels.

axes() - arguments

# function (bp, X.names = colnames(bp$X), which = 1:bp$p, col = grey(0.7), 
#     lwd = 1, lty = 1, label.dir = "Orthog", label.col = col, 
#     label.cex = 0.75, label.dist = 0, ticks = 5, tick.col = col, 
#     tick.size = 1, tick.label = TRUE, tick.label.col = tick.col, 
#     tick.label.cex = 0.6, tick.label.side = "left", tick.label.offset = 0.5, 
#     tick.label.pos = 1, predict.col = col, predict.lwd = lwd, 
#     predict.lty = lty, ax.names = X.names, orthogx = 0, orthogy = 0) 
# NULL
Argument Description
tick.label.cex Character expansion of of tick mark labels.
tick.label.side On which side of tick marks label should appear, with default “left”.
tick.label.offset Offset for position of tick mark labels.
tick.label.pos pos argument for text function for tick mark labels, with default 1.
predict.col Colour of lines showing orthogonal projection onto axes. Not in current version.

axes() - arguments

# function (bp, X.names = colnames(bp$X), which = 1:bp$p, col = grey(0.7), 
#     lwd = 1, lty = 1, label.dir = "Orthog", label.col = col, 
#     label.cex = 0.75, label.dist = 0, ticks = 5, tick.col = col, 
#     tick.size = 1, tick.label = TRUE, tick.label.col = tick.col, 
#     tick.label.cex = 0.6, tick.label.side = "left", tick.label.offset = 0.5, 
#     tick.label.pos = 1, predict.col = col, predict.lwd = lwd, 
#     predict.lty = lty, ax.names = X.names, orthogx = 0, orthogy = 0) 
# NULL
Argument Description
predict.lwd Line width of lines showing orthogonal projection onto axes. Not in current version.
predict.lty Line type of lines showing orthogonal projection onto axes. Not in current version.
ax.names User specified alternative names for variables.
orthogx Horizontal translation.
orthogy Vertical translation.

axes() - values

out <- biplot(iris[,1:4]) |> PCA() |> axes()
str(out)
# List of 19
#  $ X          : num [1:150, 1:4] -0.743 -0.943 -1.143 -1.243 -0.843 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#   ..- attr(*, "scaled:center")= Named num [1:4] 5.84 3.06 3.76 1.2
#   .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ Xcat       : NULL
#  $ raw.X      :'data.frame':  150 obs. of  4 variables:
#   ..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#   ..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#   ..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#   ..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#  $ na.action  : NULL
#  $ center     : logi TRUE
#  $ scaled     : logi FALSE
#  $ means      : Named num [1:4] 5.84 3.06 3.76 1.2
#   ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ sd         : num [1:4] 1 1 1 1
#  $ n          : int 150
#  $ p          : int 4
#  $ group.aes  : Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
#  $ g.names    : chr "1"
#  $ g          : int 1
#  $ Title      : NULL
#  $ Z          : num [1:150, 1:2] -2.68 -2.71 -2.89 -2.75 -2.73 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : NULL
#  $ Vr         : num [1:4, 1:2] 0.3614 -0.0845 0.8567 0.3583 -0.6566 ...
#  $ Xhat       : num [1:150, 1:4] 5.08 4.75 4.7 4.64 5.07 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : NULL
#   ..- attr(*, "scaled:center")= Named num [1:4] -5.84 -3.06 -3.76 -1.2
#   .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ ax.one.unit: num [1:4, 1:2] 0.643 -0.156 1.121 2.672 -1.169 ...
#  $ axes       :List of 23
#   ..$ which            : int [1:4] 1 2 3 4
#   ..$ col              : chr [1:4] "#B3B3B3" "#B3B3B3" "#B3B3B3" "#B3B3B3"
#   ..$ lwd              : num [1:4] 1 1 1 1
#   ..$ lty              : num [1:4] 1 1 1 1
#   ..$ label.dir        : chr "Orthog"
#   ..$ label.col        : chr [1:4] "#B3B3B3" "#B3B3B3" "#B3B3B3" "#B3B3B3"
#   ..$ label.cex        : num [1:4] 0.75 0.75 0.75 0.75
#   ..$ label.dist       : num [1:4] 0 0 0 0
#   ..$ ticks            : num [1:4] 5 5 5 5
#   ..$ tick.col         : chr [1:4] "#B3B3B3" "#B3B3B3" "#B3B3B3" "#B3B3B3"
#   ..$ tick.size        : num [1:4] 1 1 1 1
#   ..$ tick.label       : logi [1:4] TRUE TRUE TRUE TRUE
#   ..$ tick.label.col   : chr [1:4] "#B3B3B3" "#B3B3B3" "#B3B3B3" "#B3B3B3"
#   ..$ tick.label.cex   : num [1:4] 0.6 0.6 0.6 0.6
#   ..$ tick.label.side  : chr [1:4] "left" "left" "left" "left"
#   ..$ tick.label.offset: num [1:4] 0.5 0.5 0.5 0.5
#   ..$ tick.label.pos   : num [1:4] 1 1 1 1
#   ..$ predict.col      : chr [1:4] "#B3B3B3" "#B3B3B3" "#B3B3B3" "#B3B3B3"
#   ..$ predict.lty      : num [1:4] 1 1 1 1
#   ..$ predict.lwd      : num [1:4] 1 1 1 1
#   ..$ names            : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#   ..$ orthogx          : num [1:4] 0 0 0 0
#   ..$ orthogy          : num [1:4] 0 0 0 0
#  - attr(*, "class")= chr [1:2] "biplot" "PCA"

PCA(), axes() with plot()

#changing axes colour
biplot(iris[,1:4]) |> PCA() |> axes(col = "purple") |> plot()

PCA(), axes() with plot()

#displaying selected column names
biplot(iris[,1:4]) |> PCA() |> axes(col = "purple", X.names = c("Sepal.Length",
  "Petal.Width","Sepal.Width")) |> plot()

PCA(), axes() with plot()

#displaying selected axes
biplot(iris[,1:4]) |> PCA() |> axes(which = c(1,2,3), col = "black") |> plot()

PCA(), axes() with plot()

#displaying selected axes
#changing label direction and distance from plot
biplot(iris[,1:4]) |> PCA() |> axes(which = c(1,2,4), col = "black", 
  label.dir = "Paral", label.dist = 1) |> plot()

PCA(), axes() with plot()

#changing axes line width and axes label size
#changing tick colour and size
biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> axes(col = "black", tick.col = 
  "black", tick.size = 1.2, tick.label.cex = 0.7, lwd = 1.5, label.cex = 1) |> plot()

PCA(), axes() with plot()

#changing tick colour, size and number of ticks
biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> axes(col = "black", tick.col = 
  "black", tick.size = 1.2, ticks = 7, tick.label.col = "red", lwd = 1.2) |> plot()

PCA(), axes() with plot()

#changing tick mark positions
biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> axes(col = "black", tick.col = 
  "black", tick.size = 1.2, ticks = 5, tick.label.pos = c(2,2,3,1), tick.label.offset = 
  c(0.4,0.4,0.4,0.8), lwd = 1.2) |> plot()

PCA(), axes() with plot()

#changing axes names
biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> axes(col = "black", tick.col = 
  "black", tick.size = 1.2, lwd = 1.2, ticks = 7, tick.label.pos = c(2,2,3,1), 
  tick.label.offset = c(0.4,0.4,0.4,0.8), ax.names = c("A", "B", "","D"), 
  label.dist = 1, label.dir = "Hor") |> plot()

PCA(), axes() with plot()

#suppressing tick labels
biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> axes(col = "black", tick.col = 
  "black", tick.size = 1.2, lwd = 1.2, ticks = 7, tick.label = FALSE) |> plot()

PCA(), axes() with plot()

#horizontal translation of axes
biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> axes(col = "black", tick.col = 
  "black", tick.label.pos = c(2,2,3,1), tick.label.offset = c(0.4,0.4,0.4,0.8),
  orthogx = 2) |> plot()

PCA(), axes() with plot()

#vertical translation of axes
biplot(iris[,1:4], group.aes = iris[,5]) |> PCA() |> axes(col = "black", tick.col = 
  "black", tick.label.pos = c(2,2,3,1), tick.label.offset = c(0.4,0.4,0.4,0.8),
  orthogy = -2) |> plot()

alpha.bags() - arguments

# function (bp, alpha = 0.95, which = NULL, col = ez.col, lty = 1, 
#     lwd = 1, max = 2500, trace = TRUE) 
# NULL
Argument Description
bp Object of class biplot.
alpha Value between 0 and 1 to determine coverage of bag, with default 0.95.
which Which groups or classes to be fitted with alpha bags.
col Vector of colours for the alpha bags.

alpha.bags() - arguments

# function (bp, alpha = 0.95, which = NULL, col = ez.col, lty = 1, 
#     lwd = 1, max = 2500, trace = TRUE) 
# NULL
Argument Description
lty Vector of line types for the alpha bags.
lwd Vector of line widths for the alpha bags.
max Maximum number of samples to include in alpha bag calculation, with default 2500.

alpha.bags() - values

out <- biplot (iris[,1:4],group.aes = iris[,5]) |> PCA() |> alpha.bags(alpha = 0.95)
# Computing 0.95 -bag for setosa 
# Computing 0.95 -bag for versicolor 
# Computing 0.95 -bag for virginica
str(out)
# List of 20
#  $ X            : num [1:150, 1:4] -0.743 -0.943 -1.143 -1.243 -0.843 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#   ..- attr(*, "scaled:center")= Named num [1:4] 5.84 3.06 3.76 1.2
#   .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ Xcat         : NULL
#  $ raw.X        :'data.frame':    150 obs. of  4 variables:
#   ..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#   ..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#   ..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#   ..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#  $ na.action    : NULL
#  $ center       : logi TRUE
#  $ scaled       : logi FALSE
#  $ means        : Named num [1:4] 5.84 3.06 3.76 1.2
#   ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ sd           : num [1:4] 1 1 1 1
#  $ n            : int 150
#  $ p            : int 4
#  $ group.aes    : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
#  $ g.names      : chr [1:3] "setosa" "versicolor" "virginica"
#  $ g            : int 3
#  $ Title        : NULL
#  $ Z            : num [1:150, 1:2] -2.68 -2.71 -2.89 -2.75 -2.73 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : NULL
#  $ Vr           : num [1:4, 1:2] 0.3614 -0.0845 0.8567 0.3583 -0.6566 ...
#  $ Xhat         : num [1:150, 1:4] 5.08 4.75 4.7 4.64 5.07 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : NULL
#   ..- attr(*, "scaled:center")= Named num [1:4] -5.84 -3.06 -3.76 -1.2
#   .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ ax.one.unit  : num [1:4, 1:2] 0.643 -0.156 1.121 2.672 -1.169 ...
#  $ alpha.bags   :List of 3
#   ..$ setosa-0.95    : num [1:27, 1:2] -2.24 -2.22 -2.22 -2.21 -2.22 ...
#   .. ..- attr(*, "dimnames")=List of 2
#   .. .. ..$ : chr [1:27] "sr" "sr" "sr" "sr" ...
#   .. .. ..$ : NULL
#   ..$ versicolor-0.95: num [1:31, 1:2] 1.43 1.45 1.46 1.52 1.52 ...
#   .. ..- attr(*, "dimnames")=List of 2
#   .. .. ..$ : chr [1:31] "" "" "" "" ...
#   .. .. ..$ : NULL
#   ..$ virginica-0.95 : num [1:19, 1:2] 3.48 3.48 3.62 3.54 3.44 ...
#   .. ..- attr(*, "dimnames")=List of 2
#   .. .. ..$ : chr [1:19] "" "" "" "" ...
#   .. .. ..$ : NULL
#  $ alpha.bag.aes:List of 3
#   ..$ col: chr [1:3] "blue" "green" "gold"
#   ..$ lty: num [1:3] 1 1 1
#   ..$ lwd: num [1:3] 1 1 1
#  - attr(*, "class")= chr [1:2] "biplot" "PCA"

PCA(),alpha.bags() with plot()

#one alpha bag per group
biplot (iris[,1:4],group.aes=iris[,5]) |> PCA() |> alpha.bags(alpha=0.95) |> plot()

PCA(),alpha.bags() with plot()

#three alpha bags for group 1 with different line types
biplot (iris[,1:4],group.aes = iris[,5]) |> PCA() |> alpha.bags(alpha = c(0.99,0.8,0.6),
  which = 1, lty = c(1,2,3)) |> plot()

PCA(),alpha.bags() with plot()

#alpha bags for selected groups with different line types per alpha value
biplot (iris[,1:4],group.aes = iris[,5]) |> PCA() |> alpha.bags(alpha = c(0.95,0.6),
  which = c(1,2), lty = c(1,2,3)) |> plot()

PCA(),alpha.bags() with plot()

#three alpha bags per group with different line types per alpha value
biplot (iris[,1:4],group.aes = iris[,5]) |> PCA() |> alpha.bags(alpha = c(0.99,0.8,0.6)
  ,lty = c(1,2,3)) |> plot()

PCA(),alpha.bags() with plot()

#changing alpha bag colours
#changing sample colour
biplot (iris[,1:4],group.aes = iris[,5]) |> PCA() |> alpha.bags(alpha = c(0.99,0.8,0.6), 
  lty = c(1,2,3), col = c("purple","magenta","blue")) |> samples(col = "grey")|> plot()

concentration.ellipse() - arguments

# function (bp, df = 2, kappa = NULL, which = NULL, alpha = 0.95, 
#     col = ez.col, lty = 1, lwd = 1, alpha.transparency = 0.5, 
#     trace = TRUE) 
# NULL
Argument Description
bp Object of class biplot.
df Degrees of freedom to calculate kappa from alpha, with default 2.
kappa Value to construct kappa-ellipse. This will overwrite a given value of alpha and df .
which The selection of the group for ellipse construction.
alpha Size of alpha-bag, with default 0.95.

concentration.ellipse() - arguments

# function (bp, df = 2, kappa = NULL, which = NULL, alpha = 0.95, 
#     col = ez.col, lty = 1, lwd = 1, alpha.transparency = 0.5, 
#     trace = TRUE) 
# NULL
Argument Description
col Colour of ellipse.
lty Line type of ellipse.
lwd Line width of ellipse.
alpha.transparency Level of opacity. Not in current version.

concentration.ellipse() - values

# Computing 2 -ellipse for setosa 
# Computing 2 -ellipse for versicolor 
# Computing 2 -ellipse for virginica
# List of 20
#  $ X               : num [1:150, 1:4] -0.743 -0.943 -1.143 -1.243 -0.843 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#   ..- attr(*, "scaled:center")= Named num [1:4] 5.84 3.06 3.76 1.2
#   .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ Xcat            : NULL
#  $ raw.X           :'data.frame': 150 obs. of  4 variables:
#   ..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#   ..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#   ..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#   ..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#  $ na.action       : NULL
#  $ center          : logi TRUE
#  $ scaled          : logi FALSE
#  $ means           : Named num [1:4] 5.84 3.06 3.76 1.2
#   ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ sd              : num [1:4] 1 1 1 1
#  $ n               : int 150
#  $ p               : int 4
#  $ group.aes       : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
#  $ g.names         : chr [1:3] "setosa" "versicolor" "virginica"
#  $ g               : int 3
#  $ Title           : NULL
#  $ Z               : num [1:150, 1:2] -2.68 -2.71 -2.89 -2.75 -2.73 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : NULL
#  $ Vr              : num [1:4, 1:2] 0.3614 -0.0845 0.8567 0.3583 -0.6566 ...
#  $ Xhat            : num [1:150, 1:4] 5.08 4.75 4.7 4.64 5.07 ...
#   ..- attr(*, "dimnames")=List of 2
#   .. ..$ : chr [1:150] "1" "2" "3" "4" ...
#   .. ..$ : NULL
#   ..- attr(*, "scaled:center")= Named num [1:4] -5.84 -3.06 -3.76 -1.2
#   .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
#  $ ax.one.unit     : num [1:4, 1:2] 0.643 -0.156 1.121 2.672 -1.169 ...
#  $ conc.ellipses   :List of 3
#   ..$ setosa-2    : num [1:6284, 1:2] -2.92 -2.92 -2.92 -2.92 -2.92 ...
#   ..$ versicolor-2: num [1:6284, 1:2] -0.631 -0.631 -0.631 -0.631 -0.63 ...
#   ..$ virginica-2 : num [1:6284, 1:2] 0.722 0.722 0.722 0.722 0.723 ...
#  $ conc.ellipse.aes:List of 4
#   ..$ col  : chr [1:3] "blue" "green" "gold"
#   ..$ lty  : num [1:3] 1 1 1
#   ..$ lwd  : num [1:3] 1 1 1
#   ..$ alpha: num [1:3] 0.5 0.5 0.5
#  - attr(*, "class")= chr [1:2] "biplot" "PCA"

PCA(),concentration.ellipse() with plot()

#one kappa ellipse per group
biplot (iris[,1:4]) |> PCA(group.aes = iris[,5]) |> concentration.ellipse(kappa = 2) |> plot()

PCA(),concentration.ellipse() with plot()

#chaning the degrees of freedom for the kappa value calculation using alpha
biplot (iris[,1:4]) |> PCA(group.aes = iris[,5]) |> concentration.ellipse(alpha = 0.95,
  df = 2) |> plot()

PCA(),concentration.ellipse() with plot()

#three kappa ellipses per group with different line types per kappa value
biplot (iris[,1:4]) |> PCA(group.aes = iris[,5]) |> concentration.ellipse(kappa = c(2,3,4),
  lty = c(1,2,3)) |> plot()

PCA(),concentration.ellipse(), alpha.bags() with plot()

#one alpha bag and kappa ellipse per group
biplot (iris[,1:4]) |> PCA(group.aes = iris[,5]) |> concentration.ellipse(kappa = 2.45, 
  lty = 2) |> alpha.bags(alpha = 0.95) |> plot()

legend.type() - arguments

# function (bp, samples = FALSE, means = FALSE, bags = FALSE, ellipses = FALSE, 
#     new = FALSE, ...) 
# NULL
Argument Description
bp Object of class biplot.
samples Logical argument, whether legend should be printed for samples, with default “FALSE”.
means Logical argument, whether legend should be printed for means, with default “FALSE”. Not in current version.
bags Logical argument, whether legend should be printed for bags, with default “FALSE”.
new Logical argument, whether legend should appear on a new page, with default “FALSE”.

legend.type()

biplot (iris[,1:4], group.aes = iris[,5], Title = "Test biplot") |> PCA() |> 
  samples(pch = c(17,15,16)) |> legend.type(samples = TRUE) |> plot()

legend.type()

#include legend for alpha bags and kappa ellipses
biplot (iris[,1:4]) |> PCA(group.aes=iris[,5]) |> concentration.ellipse(kappa = 2.45,
  lty = 2) |> alpha.bags(alpha = 0.95) |> legend.type(bags = TRUE, ellipses = TRUE) |> plot()

legend.type()

#include legend in a separate plot
biplot (iris[,1:4]) |> PCA(group.aes=iris[,5]) |> concentration.ellipse(kappa = 2.45, 
  lty = 2) |> alpha.bags(alpha = 0.95) |> legend.type(bags = TRUE, ellipses = TRUE, 
  new = TRUE) |> plot()

Canonical Variate Analysis & biplots

  • Discriminant Analysis is a predictive method that focuses on the relationship between a data matrix of predictor variables and a response variable.
  • Its primary goal is to optimally separate different groups of objects to provide a classification rule for assigning entities of unknown origin to one of a known set of groups.
  • In Linear Discriminant Analysis (LDA), researchers attempt to find linear combinations of the predictor variables that maximise the ratio of between-class variance to within-class variance.

Canonical Variate Analysis & biplots

  • CVA is a direct application of LDA, making a CVA biplot a graphical representation of LDA.
  • The goal is to discover a mapping that can transform each column of matrix \(\pmb{X}\) into a column within a lower-dimensional space, while exploring the cluster structure of the original data.

Constructing the CVA biplot

  • The goal of CVA is to find a set of \(p\) linear combinations of \(\pmb{X}\), \({\bf{m}}_1, {\bf{m}}_2, \ldots, {\bf{m}}_p\) that maximise the between-group variation and minimise the within-group variation.
  • These linear combinations are known as canonical variates.
  • The within-group variation can be measured by the sum of squared deviations from their respective means, \({\bf{W}} = {\bf{X}}'{\bf{X}} - \bar{\bf{X}}'{\bf{N}}\bar{\bf{X}}\).
  • The between-group variation can be measured by the sum of squared deviations between the group means, \({\bf{B}} = \bar{\bf{X}}' {\bf{N}} \bar{\bf{X}}\).
  • where \({\bf{N}} = diag(n_1,n_2, \ldots, n_K)\)
  • The linear combination which optimally seperates the observations from the different groups with respect to the \(n\) samples is defined by the coefficient vector \(\pmb{m}\) which maximises the variance ratio: \(\lambda = \frac{{\bf{m}}' {\bf{B}}{\bf{m}}}{{\bf{m}}' {\bf{W}} {\bf{m}}}\).

Representing samples

The observed values of the \(p\) canonical variables for an observation with measurement vector \({\bf{x}}\) are given by the elements of the vector,

\[{\bf{y}}' = {\bf{x}}'{\bf{M}}\]

where \({\bf{M}} = {\bf{L V}}\)

\({\bf{L}} = {\bf{W}}^{-\frac{1}{2}}\)

\({\bf{V}}\) are the eigenvectors of \({\bf{W}}^{-\frac{1}{2}} {\bf{B}} {\bf{W}}^{-\frac{1}{2}}\)

In a CVA biplot, the observations belonging to different groups are represented as points. For any sample point \({\bf{x}}\), it is represented in the two-dimensional CVA biplot by the first two columns of \({\bf{x}}'{\bf{M}}\). Observations that are close together in the biplot are similar in their scores on the first two canonical variables, while observations that are far apart are dissimilar.

Representing variables

The original measured variables can then be expressed as functions of the canonical variables:

\[ {\bf{x}} = {\bf{y}}'{\bf{M}}^{-1} \]

The variables are represented as vectors, with the direction of each vector indicating the role of the variable in discriminating between the groups. We use the above to construct prediction biplot axes to represent the \(p\) variables.

Calibrated biplot axes

The vectors can be calibrated for prediction. The position of the marker point for predicting a certain value \(\mu\) on the \(i\)-th variable is given by

\[ \frac{\mu}{{\bf{e}}_i' ({\bf{M}}^{-1})'{\bf{J}}{\bf{M}}^{-1}{\bf{e}}_i} {\bf{e}}_i' ({\bf{M}}^{-1})' {\bf{J}} \]

Diabetes data

#install.packages("heplots")
#install.packages("dplyr")
library(heplots)
library(dplyr)
data(Diabetes)
Diabetes
#     relwt glufast glutest instest sspg             group
# 1    0.81      80     356     124   55            Normal
# 2    0.95      97     289     117   76            Normal
# 3    0.94     105     319     143  105            Normal
# 4    1.04      90     356     199  108            Normal
# 5    1.00      90     323     240  143            Normal
# 6    0.76      86     381     157  165            Normal
# 7    0.91     100     350     221  119            Normal
# 8    1.10      85     301     186  105            Normal
# 9    0.99      97     379     142   98            Normal
# 10   0.78      97     296     131   94            Normal
# 11   0.90      91     353     221   53            Normal
# 12   0.73      87     306     178   66            Normal
# 13   0.96      78     290     136  142            Normal
# 14   0.84      90     371     200   93            Normal
# 15   0.74      86     312     208   68            Normal
# 16   0.98      80     393     202  102            Normal
# 17   1.10      90     364     152   76            Normal
# 18   0.85      99     359     185   37            Normal
# 19   0.83      85     296     116   60            Normal
# 20   0.93      90     345     123   50            Normal
# 21   0.95      90     378     136   47            Normal
# 22   0.74      88     304     134   50            Normal
# 23   0.95      95     347     184   91            Normal
# 24   0.97      90     327     192  124            Normal
# 25   0.72      92     386     279   74            Normal
# 26   1.11      74     365     228  235            Normal
# 27   1.20      98     365     145  158            Normal
# 28   1.13     100     352     172  140            Normal
# 29   1.00      86     325     179  145            Normal
# 30   0.78      98     321     222   99            Normal
# 31   1.00      70     360     134   90            Normal
# 32   1.00      99     336     143  105            Normal
# 33   0.71      75     352     169   32            Normal
# 34   0.76      90     353     263  165            Normal
# 35   0.89      85     373     174   78            Normal
# 36   0.88      99     376     134   80            Normal
# 37   1.17     100     367     182   54            Normal
# 38   0.85      78     335     241  175            Normal
# 39   0.97     106     396     128   80            Normal
# 40   1.00      98     277     222  186            Normal
# 41   1.00     102     378     165  117            Normal
# 42   0.89      90     360     282  160            Normal
# 43   0.98      94     291      94   71            Normal
# 44   0.78      80     269     121   29            Normal
# 45   0.74      93     318      73   42            Normal
# 46   0.91      86     328     106   56            Normal
# 47   0.95      85     334     118  122            Normal
# 48   0.95      96     356     112   73            Normal
# 49   1.03      88     291     157  122            Normal
# 50   0.87      87     360     292  128            Normal
# 51   0.87      94     313     200  233            Normal
# 52   1.17      93     306     220  132            Normal
# 53   0.83      86     319     144  138            Normal
# 54   0.82      86     349     109   83            Normal
# 55   0.86      96     332     151  109            Normal
# 56   1.01      86     323     158   96            Normal
# 57   0.88      89     323      73   52            Normal
# 58   0.75      83     351      81   42            Normal
# 59   0.99      98     478     151  122 Chemical_Diabetic
# 60   1.12     100     398     122  176            Normal
# 61   1.09     110     426     117  118            Normal
# 62   1.02      88     439     208  244 Chemical_Diabetic
# 63   1.19     100     429     201  194 Chemical_Diabetic
# 64   1.06      80     333     131  136            Normal
# 65   1.20      89     472     162  257 Chemical_Diabetic
# 66   1.05      91     436     148  167 Chemical_Diabetic
# 67   1.18      96     418     130  153            Normal
# 68   1.01      95     391     137  248            Normal
# 69   0.91      82     390     375  273            Normal
# 70   0.81      84     416     146   80            Normal
# 71   1.10      90     413     344  270 Chemical_Diabetic
# 72   1.03     100     385     192  180            Normal
# 73   0.97      86     393     115   85            Normal
# 74   0.96      93     376     195  106            Normal
# 75   1.10     107     403     267  254            Normal
# 76   1.07     112     414     281  119            Normal
# 77   1.08      94     426     213  177 Chemical_Diabetic
# 78   0.95      93     364     156  159            Normal
# 79   0.74      93     391     221  103            Normal
# 80   0.84      90     356     199   59            Normal
# 81   0.89      99     398      76  108            Normal
# 82   1.11      93     393     490  259            Normal
# 83   1.19      85     425     143  204 Chemical_Diabetic
# 84   1.18      89     318      73  220            Normal
# 85   1.06      96     465     237  111 Chemical_Diabetic
# 86   0.95     111     558     748  122 Chemical_Diabetic
# 87   1.06     107     503     320  253 Chemical_Diabetic
# 88   0.98     114     540     188  211 Chemical_Diabetic
# 89   1.16     101     469     607  271 Chemical_Diabetic
# 90   1.18     108     486     297  220 Chemical_Diabetic
# 91   1.20     112     568     232  276 Chemical_Diabetic
# 92   1.08     105     527     480  233 Chemical_Diabetic
# 93   0.91     103     537     622  264 Chemical_Diabetic
# 94   1.03      99     466     287  231 Chemical_Diabetic
# 95   1.09     102     599     266  268 Chemical_Diabetic
# 96   1.05     110     477     124   60 Chemical_Diabetic
# 97   1.20     102     472     297  272 Chemical_Diabetic
# 98   1.05      96     456     326  235 Chemical_Diabetic
# 99   1.10      95     517     564  206 Chemical_Diabetic
# 100  1.12     112     503     408  300 Chemical_Diabetic
# 101  0.96     110     522     325  286 Chemical_Diabetic
# 102  1.13      92     476     433  226 Chemical_Diabetic
# 103  1.07     104     472     180  239 Chemical_Diabetic
# 104  1.10      75     455     392  242 Chemical_Diabetic
# 105  0.94      92     442     109  157 Chemical_Diabetic
# 106  1.12      92     541     313  267 Chemical_Diabetic
# 107  0.88      92     580     132  155 Chemical_Diabetic
# 108  0.93      93     472     285  194 Chemical_Diabetic
# 109  1.16     112     562     139  198 Chemical_Diabetic
# 110  0.94      88     423     212  156 Chemical_Diabetic
# 111  0.91     114     643     155  100 Chemical_Diabetic
# 112  0.83     103     533     120  135 Chemical_Diabetic
# 113  0.92     300    1468      28  455    Overt_Diabetic
# 114  0.86     303    1487      23  327    Overt_Diabetic
# 115  0.85     125     714     232  279    Overt_Diabetic
# 116  0.83     280    1470      54  382    Overt_Diabetic
# 117  0.85     216    1113      81  378    Overt_Diabetic
# 118  1.06     190     972      87  374    Overt_Diabetic
# 119  1.06     151     854      76  260    Overt_Diabetic
# 120  0.92     303    1364      42  346    Overt_Diabetic
# 121  1.20     173     832     102  319    Overt_Diabetic
# 122  1.04     203     967     138  351    Overt_Diabetic
# 123  1.16     195     920     160  357    Overt_Diabetic
# 124  1.08     140     613     131  248    Overt_Diabetic
# 125  0.95     151     857     145  324    Overt_Diabetic
# 126  0.86     275    1373      45  300    Overt_Diabetic
# 127  0.90     260    1133     118  300    Overt_Diabetic
# 128  0.97     149     849     159  310    Overt_Diabetic
# 129  1.16     233    1183      73  458    Overt_Diabetic
# 130  1.12     146     847     103  339    Overt_Diabetic
# 131  1.07     124     538     460  320    Overt_Diabetic
# 132  0.93     213    1001      42  297    Overt_Diabetic
# 133  0.85     330    1520      13  303    Overt_Diabetic
# 134  0.81     123     557     130  152    Overt_Diabetic
# 135  0.98     130     670      44  167    Overt_Diabetic
# 136  1.01     120     636     314  220    Overt_Diabetic
# 137  1.19     138     741     219  209    Overt_Diabetic
# 138  1.04     188     958     100  351    Overt_Diabetic
# 139  1.06     339    1354      10  450    Overt_Diabetic
# 140  1.03     265    1263      83  413    Overt_Diabetic
# 141  1.05     353    1428      41  480    Overt_Diabetic
# 142  0.91     180     923      77  150    Overt_Diabetic
# 143  0.90     213    1025      29  209    Overt_Diabetic
# 144  1.11     328    1246     124  442    Overt_Diabetic
# 145  0.74     346    1568      15  253    Overt_Diabetic

CVA Biplot

biplot(Diabetes[,1:5], group.aes = Diabetes[,6]) |> CVA() |> axes(label.dist = 0.2) |>  plot()

CVA() - arguments

# function (bp, dim.biplot = c(2, 1, 3), e.vects = 1:ncol(bp$X), 
#     group.aes = bp$group.aes, weightedCVA = "weighted", ...) 
# NULL
Argument Description
bp Object of class biplot obtained from preceding function biplot().
dim.biplot Dimension of the biplot. Only values 1, 2 and 3 are accepted, with default 2. Only 2D in current version.
e.vects Which eigenvectors (principal components) to extract, with default 1:dim.biplot. Only 2D in current version.

CVA() - arguments

# function (bp, dim.biplot = c(2, 1, 3), e.vects = 1:ncol(bp$X), 
#     group.aes = bp$group.aes, weightedCVA = "weighted", ...) 
# NULL
Argument Description
group.aes Vector of the same length as the number of rows in the data matrix for differentiated aesthetics for samples.
weightedCVA The default is “weighted”, specifying a weighted CVA to be performed. Other possible values are “unweightedI” and “unweightedCent”. When group sizes are similar, weighting will have a minimal effect.

CVA() with plot() - groups

biplot(Diabetes[,3:5], group.aes = Diabetes[,6]) |> CVA() |>  axes(tick.label.pos
  = c(1,2,1,4), tick.label.offset = c(0.7,0.2,0.4,1), label.dist = 0.2) |> plot()

Thank you

We hope this is the start of many EZ biplot constructions!













Slide inspiration: Dianne Cook MDAG 2022